Variation in heat shock protein 40 kDa relates to divergence in thermotolerance among cryptic rotifer species

Genetic divergence and the frequency of hybridization are central for defining species delimitations, especially among cryptic species where morphological differences are merely absent. Rotifers are known for their high cryptic diversity and therefore are ideal model organisms to investigate such patterns. Here, we used the recently resolved Brachionus calyciflorus species complex to investigate whether previously observed between species differences in thermotolerance and gene expression are also reflected in their genomic footprint. We identified a Heat Shock Protein gene (HSP 40 kDa) which exhibits cross species pronounced sequence variation. This gene exhibits species-specific fixed sites, alleles, and sites putatively under positive selection. These sites are located in protein binding regions involved in chaperoning and may therefore reflect adaptive diversification. By comparing three genetic markers (ITS, COI, HSP 40 kDa), we revealed hybridization events between the cryptic species. The low frequency of introgressive haplotypes/alleles suggest a tight, but not fully impermeable boundary between the cryptic species.

. Twenty-seven of these genes were assigned to the GO term GO:0006950 (response to stress), of which one gene was annotated to "response to heat" (GO:00009408). This gene (i.e., HSP 40 kDa) was differentially expressed in B. calyciflorus s.s. in cultures grown after 4 h of heat exposure to different temperature (20 °C vs. 32 °C and 20 °C vs. 26 °C), while it did not show a temperature-specific expression in B. fernandoi (Fig. 1). HSP 40 kDa sequence diversity. The sequence diversity of all five Brachionus species (B. calyciflorus s.s., B. fernandoi, B. rubens, B. angularis, and B. diversicornis) was assessed by comparing HSP 40 kDa and Cytochrome Oxidase Subunit I (COI) genes. We detected an overall higher nucleotide divergence in the mitochondrial COI compared to the HSP 40 kDa gene fragment (Fig. 2). For the HSP 40 kDa sequences, the lowest nucleotide divergence was detected between B. angularis and B. diversicornis (0.103), followed by B. calyciflorus s.s. and B. fernandoi (0.121). The highest nucleotide divergence was found between B. fernandoi and B. rubens (0.198 Table 3). Out of these 11 fixed sites, 9 amino acid substitutions found in B. fernandoi are considered benign in our Polyphen2 analysis, while two (site 63 and 66) may change protein function (inferred "possibly damaging", Table 3). Additionally, among the six positively selected sites, two amino acid substitutions may influence protein function (inferred "possibly damaging", Table 3). The inference of character polarity (ancestral vs. derived) showed a prevalence of B. calyciflorus s.s. to retain the ancestral variant (10/17). For position 292, the status ancestral or derived could not be identified with certainty (Table 3).
We inferred a divergence time of 25-29 Million years (Myr) between B. calyciflorus s.s. and B. fernandoi, based on mitochondrial cds, while the split between the freshwater and the marine species of Brachionus was estimated to be in a range of 41-47 Myr ( Supplementary Fig. s8).

Discussion
Candidate gene selection pipeline. The utilization of available transcriptomes and different expression patterns of B. calyciflorus s.s. and B. fernandoi allowed us to identify a candidate gene which likely plays a role in the differences in temperature adaptation among the two Brachionus species. Indeed, HSPs have been shown to be involved in the response to various environmental stressors such as thermal stress, toxins, oxidative conditions [62][63][64] and aging 61 . The importance of HSPs in the process of thermotolerance for invertebrates, particularly HSP 40 kDa, HSP 60 kDa and HSP 70 kDa, was previously shown 51 Table 1. Likelihood ratio tests of seven different site models performed on 296 amino acids (888 bp) of the HSP 40 kDa, including 29 of the 32 unique alleles (A11, A12 and A18 were excluded from the analysis, as they only occurred in specimens affected by hybridization/introgression). Underlying phylogeny is based on the ITS1 (532 bp) of B. calyciflorus s.s., B. fernandoi, B. angularis, B. diversicornis and B. rubens. For each likelihood ratio test (LRT), we provide log likelihood (l) values of the compared tests (2∆l), degrees of freedom (df), p-values, estimated average ratio of non-synonymous vs. synonymous substitutions (ω), and the sites inferred to be under positive selection by Bayes Empirical Bayes (BEB) analysis, labeled with an * for a posterior probability of > 95% and ** for > 99%. www.nature.com/scientificreports/ divergent evolution in the HSP 40 kDa. All but one allele, which originated from descendants of a hybrid, were species-specific. The high number of alleles and expressed structural differences (27 unique protein sequences) within and between the two species might be related to their cyclical parthenogenesis 65 . The dominant asexual mode of reproduction via mitotic eggs from amictic females (i.e., clones) and a less frequent switch to sexual reproduction together with a short generation time 66 strongly reduces genomic recombination, making purifying selection to eliminate slightly deleterious mutations less effective for those organisms, a phenomenon known as Muller's ratchet [67][68][69] .
The comparison between all five analysed species revealed that the species from the B. calyciflorus species complex are more similar (based on nucleotide divergence of the HSP 40 kDa and COI species-wise consensus sequences) to one another than to representatives from outside that species complex (B. rubens, B. angularis and B. diversicornis), with an overall higher nucleotide diversity in the COI. The higher diversity in the mitochondrial COI compared to nuclear HSP 40 kDa gene was expected as mitochondrial DNA in general exhibits a higher mutation rate than nuclear genes 70 , a phenomenon observed as well in rotifers 71 . While HSP 40 kDa gene divergence among Brachionus species generally resembled their phylogenetic affinity, the dN/dS ratio is the smallest, when comparing the not directly related species B. diversicornis and B. calyciflorus s.s. This may have arisen from similar environmental parameters that lead to similar selection pressures 72 . Indeed, a recent study found the occurrence of B. diversicornis to be correlated with warmer water temperatures 73 , which could indicate a higher thermotolerance-as observed in B. calyciflorus s.s. 35,36 -also in this species.
Selection scenario of the HSP 40 kDa. Within the HSP 40KDa gene, six amino acid sites were inferred to be under positive selection. A closer inspection of those sites revealed that four of them are in functional domains of the gene. One site is located in the J-domain, a region known to be important in the chaperone activity of HSP 40 kDa and the HSP 70 kDa by interacting with the ATP domain of the HSP 70kDa 74 , which accelerates ATP hydrolysis 75,76 and thus, regulates the ATPase activity 77 . Three other sites are located in the C-terminal domain of the HSP 40 kDa. Although the exact function of these sites is still undetermined, it is known that the C-terminal domain of yeast Saccharomyces cerevisiae contains a peptide binding site 78,79 and is therefore enabled to interact with other proteins. Changes in amino acid composition at these sites can affect folding patterns and thus binding with other proteins (e.g., HSP 70 kDa) through suboptimal structures or polarities 80,81 . These suboptimal structures can affect the strength of HSP 70 kDa ATPase regulation and potentially alter an organism's Table 3. Prediction of impact of non-synonymous (fixed or under selection cf. Table 1)  www.nature.com/scientificreports/ downstream stress response, which is underlined by the predicted impact of amino acid substitutions on protein function at two of the six sites. However, among the inferred sites under positive selection, only one (292, cf. Supplementary Fig. s4) exhibited species-specific amino acids. One explanation could be that the large geographic range of our study correlated with different local selection pressures, such that most substitutions were not fixed among the entire distribution range of a species. Another explanation could be a low genomic recombination rate (Muller's ratchet), whereby clonal organisms easily accumulate slightly deleterious mutations that can generate noise in the selection analysis, hindering the identification of species-specific sites under selection 82 . The latter is in line with the findings of the McDonald-Kreitman Test (MKT) where a high Neutrality Index of 12.05 was found between the unique alleles of B. calyciflorus s.s. and B. fernandoi. This indicates an overabundance of replacement-site polymorphisms, which can arise from slightly deleterious polymorphic variants not eliminated by purifying selection 83 , a phenomenon known from clonal organisms 84 . Because of the potentially compromised power of selection analyses in clonal organisms 82 , we inspected all further amino acid substitutions as well. Indeed, among them eleven fixed positions between our species were identified, two of which (positions 63 and 66; cf. Table 3) with a potential impact on protein function. These sites are not located in known functional domains of the HSP 40 kDa, yet they may have an impact on the structure of the protein.  86 . However, all these rates are derived from distant taxa and-to the best of our knowledge-no rotifer-specific substitution rates are available so far. Hence, these analyses should be repeated when substitution rate estimates for our target species (B. calyciflorus s.s. and B. fernandoi) become available. In any case, we note that estimated divergence times within the freshwater B. calyciflorus species complex are in the same order of magnitude as in the marine B. plicatilis species complex (represented by B. plicatilis and B. manjavacas in our analysis; cf. Supplementary Fig. s8).

Hybridization event.
Our study revealed a natural hybridization event between B. fernandoi and B. calyciflorus s.s. individuals originating at one sampling location (a single pond): Among the specimens collected there, four exhibited B. fernandoi ITS1 alleles, but shared the same B. calyciflorus s.s COI haplotype.). This comprises-to the best of our knowledge-the first reported in-situ hybridization among these two species. The ability of the species within the B. calyciflorus species complex to hybridize has already been suggested by the widespread occurrence of mitonuclear discordance 23 . However, only recent studies under laboratory conditions observed hybridization between the most closely related species B. calyciflorus s.s. and B. elevatus 6,7 , but with a significantly lower intraspecific fertilization (i.e., pre-zygotic isolation) and higher dormant propagule mortality (i.e., post-zygotic isolation) 6 . Another study detected signs of hybridization between B. elevatus and B. dorcas 87 . The comparison of alleles/haplotypes (HSP, ITS1, COI) in our four hybrid specimens with other samples from the same location revealed the hybrids to be F 2 or a subsequent generation and that genes from B. calyciflorus s.s. introgressed into the local B. fernandoi gene pool. This is because all the introgressed specimens we detected in this study were homozygous for B. fernandoi alleles at the ITS1 locus. Furthermore, three detected hybrids carried B. fernandoi specific HSP 40 kDa alleles, while in one case a B. calyciflorus s.s. specific HSP 40 kDa allele occurred. This different genetic makeup among the hybrids suggests that they either have emerged from independent hybridization events or that the hybridization was sufficiently ancient to allow for repeated sexual recombination after the hybridization event. In any case, our findings demonstrate that B. calyciflorus s.s. and B. fernandoi can naturally hybridize at sites where they occur in sympatry. It was found in the marine B. plicatilis species complex that niche differentiation of very similar species (e.g., body size, biotic niches, competition abilities) is facilitated by their response to changing physical environments in combination with life and diapause history traits 16 . In the B. calyciflorus species complex, a previous study reported strongest pronounced differences in life history traits of B. fernandoi compared to B. calyciflorus s.s., B. elevatus and B. dorcas, such as prolonged egg and juvenile development times and an overall lower egg production rate and mictic ratio 32 . The adaptation to different temperature optima 34 , associated with differences in HSP 40 Da expression 35 and protein structure (this study), may lead to seasonal isolation of the two species. This illustrates the above example that the niche of very similar species can be determined by their response to abiotic environmental conditions. The differential niches (here preferred temperature) occupied by the respective species reflect divergent adaptation. This acts as a pre-zygotic isolation mechanism by facilitating differences in timing of reproduction and population growth based on distinct environmental cues (i.e., temperature, photoperiod) and thus maintaining the species boundaries. Such pre-zygotic isolation by season has also been observed among Daphnia species (reviewed in 88  www.nature.com/scientificreports/ mechanism may however be imperfect under certain environmental conditions, leading to occasional hybridization, as observed in our study.

Conclusion
We used transcriptome and expression data to infer a suitable candidate, the HSP 40 kDa gene, which may play a role in temperature adaptation in the Brachionus calyciflorus species complex. Species-specific alleles and fixed amino acid substitutions sites as well as positively selected sites located in functional domains of the protein were identified. The high number of detected alleles, the results of the branch site selection test, and the fixed sites indicate a divergent adaptation process with B. calyciflorus s.s. retaining ancestral features, from which B. fernandoi derived by positive directional selection. Furthermore, the study revealed descendants of an in-situ hybridization event between B. calyciflorus s.s. and B. fernandoi. This finding indicates that hybridization between those species is possible. We hypothesize that the temporally isolated niches they populate serve as pre-zygotic isolation mechanisms. This isolation may be imperfect under certain environmental conditions, yet hybridizations seem rare and seasonal niche separation generally prevents species boundaries from becoming blurred.

Materials and methods
Candidate gene selection. To identify candidate genes with a sequence divergence pattern related to differential temperature adaptation, previously published transcriptome data 35 (GenBank accessions number SRA: SRR10426055-76) was analysed using the following customized bioinformatic pipeline ( Supplementary Fig. s1): De novo transcriptome assemblies were used to generate a super transcript for each species using Trinity v. 2.11.0 gene splicer modeler 89 . This super transcript step was integrated to minimize redundancy in the data by inferring original genes and alternative splice forms 90 . A reciprocal best blast hit process was performed, using the blastn program of BLAST 2.9.0+ 91 to detect orthologous genes shared by B. calyciflorus s.s. and B. fernandoi.
To determine the best blast hit, results were filtered using a combination of E-value, which is a size corrected measure of statistical significance, and the bit score, which is a measure of matches and mismatches between two sequences 91 . When E-values of the first two or more hits were the same, the best hit was chosen using the bit score. For each search, a list was created that contained one best hit per query. Orthologous gene pairs were extracted, and open reading frames were predicted using TransDecoder v 5.5.0 92 . To calculate ratios of the rates of non-synonymous and synonymous substitutions (dN/dS) from protein-coding regions, sequences were then translated into amino acid sequences using Biopython 93 and locally aligned within the Biopython integrated package pairwise2 using a Smith-Waterman algorithm. Locally aligned coding sequences and the corresponding unaligned DNA sequences were used to create codon alignments using pal2nal 94 . Selection tests were conducted using all available seven (M0, M1a, M2, M3, M7, M8 and M8a) codeML models (Supplementary Table s1), as implemented in PAML v. 4.9 95 . These models where then compared in four (M0 vs. M3, M1a vs. M2, M7 vs. M8, M8 vs. M8a) predefined likelihood-ratio tests (LRT) to decide whether the different null models (i.e., models that do not allow for any codons ω > 1, corresponding to absence of positive selection) can be rejected 95 .
Orthologous genes that were significant (< 0.05) in all four LRT comparisons were annotated using Gene Ontology (GO) terms 96 . These terms are hierarchically ordered descriptions of genes or proteins molecular functions, biological processes, and cellular components. The GO term annotation was done on the online Server Argot2.5 and its batch processing function 97 . To use this function, the longest available amino acid sequence of each protein was used in a blast search against the Uniprot Swiss-Prot database 98 and a hmmer search 99,100 against the Pfam-A database 100 . In addition, we compared the genes exhibiting positive selection in the selection tests with those found to be differentially expressed relative to species and temperature (control: 20 °C; mild heat: B. calyciflorus s.s. 26 °C, B. fernandoi 23 °C; high heat: B. calyciflorus s.s. 32 °C, B. fernandoi 26 °C) 35 . Among those genes functionally related to temperature tolerance (GO:00009408; "response to heat"; GO:00009409; "response to cold"), only one orthologous gene, the HSP 40 kDa, exhibited signs of positive selection and was differentially expressed in more than one temperature category, hence chosen for in-depth analysis.  Fig. s2, Supplementary Table s2). To prevent that the DNA from single individuals is lost during the extraction process, 3 µL carrier RNA [1 µg/µL] (QIAGEN, Germany) was added to the sample during the lysis. DNA was extracted using the NucleoSpin®Tissue extraction kit (Macherey-Nagel, Germany) following the manufacturer's protocol for animal tissue (page [12][13][14]. Primer design HSP 40 kDa. To Fig. s3). Subsequently, to further compare the two formerly cryptic species of B. calyciflorus s.s. and B. fernandoi, a codon-based McDonald-Kreitman test (MKT) and Tajima's D statistics (allele frequency based) were calculated based on the unique inferred alleles using DNAsp v. 6. Species-specific fixed amino acid substitutions within the B. calyciflorus species complex were identified (Supplementary Fig. s5) and the ancestral amino acid was inferred under parsimony, using sequence information from B. rubens, B. angularis and B. diversicornis and the ITS1 phylogeny ( Supplementary Fig. s3). The functional effects of the specific amino acid (aa) substitutions were predicted with Polyphen2 119 using B. calyciflorus s.s. as a reference. Note here that Polyphen2 does not consider the possibility of a positive effect of an aa substitution, www.nature.com/scientificreports/ such that it ranks any substitution without a predicted change in protein function as "benign" (i.e., no effect), while any predicted change is ranked as "damaging" (i.e., predicted to impact protein function  figure s8). In absence of a published substitution rate for rotifers, the standard divergence rate for invertebrates/insects (2.3% Myr −1 ) with a constant substitution rate of 0.0115 per million years per lineages 86 was used to calibrate the phylogeny. Cds were aligned in Geneious v. 8.1.9 using ClustalW and the IQ-Tree webserver 129 was used to determine the best fitting substitution model. The BEAST run was conducted with 10 million MCMC iterations, with GTR + G4 + I chosen as the substitution model. Convergence was checked with Tracer v.1.7.2 130 , using the first million MCMC iterations as the burn-in value and tree was visualized using FigTree version 1.4.4. and adapted in Inkscape version 1.0.1.